{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "034e355d-83a0-4bd2-877e-0f493c5f713d",
   "metadata": {
    "tags": []
   },
   "source": [
    "# 10K Analysis\n",
    "In this demo, we answer a compare and contrast query with two different approaches:\n",
    "1. structured reasoning via index graph\n",
    "2. free form reasoning via ReAct agent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 537,
   "id": "298e7dfc-be9a-49bd-8793-9a1d9fa52c8d",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index import (\n",
    "    SimpleDirectoryReader,\n",
    "    LLMPredictor,\n",
    "    ServiceContext,\n",
    "    VectorStoreIndex,\n",
    ")\n",
    "from llama_index.response.pprint_utils import pprint_response\n",
    "from langchain import OpenAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 419,
   "id": "35364259-f1c3-4df0-b8c9-79e0afca7436",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "llm_predictor = LLMPredictor(\n",
    "    llm=OpenAI(\n",
    "        temperature=0, model_name=\"text-davinci-003\", max_tokens=-1, streaming=True\n",
    "    )\n",
    ")\n",
    "service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9be92b8e-1d3f-4ac9-b88a-10f69e3d6645",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true,
    "tags": []
   },
   "source": [
    "# Lyft"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 420,
   "id": "d282214f-69d2-49d4-8ab3-e1b1560f0be6",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "lyft_docs = SimpleDirectoryReader(input_files=[\"../data/10k/lyft_2021.pdf\"]).load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 240,
   "id": "60826815-3282-4947-aa4f-ee75dcc033c3",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "lyft_index = VectorStoreIndex.from_documents(lyft_docs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 421,
   "id": "a0d76160-00fe-4cab-aab9-1f227db6feca",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 451,
   "id": "c57dad40-add8-469a-aaeb-fe19900206fe",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "response = lyft_engine.query(\n",
    "    \"what is the revenue growth in the last year, show me the reference page number\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 452,
   "id": "fe3ee56e-4209-4e43-9f92-dc0352b1afe5",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Response: The revenue growth in the last year is 36%, as\n",
      "referenced on page 63.\n",
      "______________________________________________________________________\n",
      "Source Node 1/3\n",
      "Document ID: 6a1711e2-51be-46c1-8152-72361c192ed9\n",
      "Similarity: 0.8045102189652867\n",
      "Text: page_label: 63  Results of OperationsThe following table summar\n",
      "izes our historical consolidated statements of operations data:Year\n",
      "Ended December 31, 2021 2020 2019 (in thousands) Revenue $ 3,208,323 $\n",
      "2,364,681 $ 3,615,960 Costs and expenses Cost of revenue 1,649,532\n",
      "1,447,516 2,176,469 Operations and support 402,233 453,963 636,116\n",
      "Research a...\n",
      "______________________________________________________________________\n",
      "Source Node 2/3\n",
      "Document ID: c15baf48-6db4-4892-88ba-8be2a6f195a5\n",
      "Similarity: 0.8030623987053549\n",
      "Text: page_label: 19  changing industries. If our assumptions\n",
      "regarding these risks and uncertainties, which we use to plan and\n",
      "operate ourbusiness,  are incorrect or change, or if we do not address\n",
      "these risks successfully, our results of operations could differ\n",
      "materially from our expectations and our business,financial condition\n",
      "and  results of ope...\n",
      "______________________________________________________________________\n",
      "Source Node 3/3\n",
      "Document ID: 21db4c23-fb06-48bb-91c7-2f52d158a123\n",
      "Similarity: 0.8023439081890719\n",
      "Text: page_label: 64  second quarter of 2021. These increases were\n",
      "offset by investments in driver supply by increasing driver incentives\n",
      "recorded as a reduction to revenue by $942.9 million in2021 as\n",
      "compared to the prior y ear as rider demand outpaced driver supply\n",
      "during certain periods of the pandemic recovery in 2021. Revenue in\n",
      "2020 was also hig...\n"
     ]
    }
   ],
   "source": [
    "pprint_response(response, show_source=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7113434-0e41-46b6-a74e-284ce211fd38",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true,
    "tags": []
   },
   "source": [
    "# Uber"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 424,
   "id": "e35842e1-b058-4e1a-8f51-4c84b42ff412",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "uber_docs = SimpleDirectoryReader(input_files=[\"../data/10k/uber_2021.pdf\"]).load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 425,
   "id": "4e6c3178-6aab-4fdc-99f6-c820661e7a73",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "uber_index = VectorStoreIndex.from_documents(uber_docs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 429,
   "id": "7b8e9d46-7d66-4a22-a9d8-4e2ce21491ce",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "uber_engine = uber_index.as_query_engine(similarity_top_k=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 430,
   "id": "444ff6cf-2eb6-4f5c-905a-5928d0dc8947",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "response = uber_engine.query(\n",
    "    \"what is the revenue growth in the last year, show me the reference page number\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 431,
   "id": "71de4581-5c03-4a08-8d00-ddee08530a0f",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Response: The revenue growth in the last year was 57%, as\n",
      "referenced on page 57.\n",
      "______________________________________________________________________\n",
      "Source Node 1/3\n",
      "Document ID: 0c8abeea-57cd-41bf-a94c-1c5d195bb3ae\n",
      "Similarity: 0.8059365413378133\n",
      "Text: page_label: 57  The following table sets forth the components of\n",
      "our consolidated statements of operations for each of the periods\n",
      "presented as a percentage of revenue : Year Ended December 31, 2020\n",
      "2021 Revenue 100 %100 %Costs and expenses Cost of revenue, exclusive\n",
      "of dep reciation and amortization shown separately below46 %54\n",
      "%Operations and ...\n",
      "______________________________________________________________________\n",
      "Source Node 2/3\n",
      "Document ID: 012bb7d8-36ca-42ee-8d9e-9d4754ce599f\n",
      "Similarity: 0.7980283554264312\n",
      "Text: page_label: 60  Provision for (Benefit from) Income TaxesYear\n",
      "Ended December 31, 2020 to 2021 % Change (In millions, except\n",
      "percentages) 2020 2021 Provision for (benefit fro m) income taxes$\n",
      "(192) $ (492) (156) % Effective tax rate 2.8 %48.0 % 2021 Compared to\n",
      "2020 Provision  for (benefit from) income taxes increased by $300\n",
      "million primarily du...\n",
      "______________________________________________________________________\n",
      "Source Node 3/3\n",
      "Document ID: 697ccb91-0b6e-4dfe-b78a-b3115a3bfbe0\n",
      "Similarity: 0.792540481338011\n",
      "Text: page_label: 62  All OtherFor  the  year  ended  December  31,\n",
      "2021  compared  to  the  same  period  in  2020,  All  Other  revenue\n",
      "decreased  $127  million,  or  94%  and  All  Otheradjusted EBITDA\n",
      "loss improved $450 million, or 98%. All  Other revenue and All Other\n",
      "adjusted EBITDA loss improved primarily due to the  favorable impact\n",
      "of the s...\n"
     ]
    }
   ],
   "source": [
    "pprint_response(response, show_source=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20884bcd-8120-43ed-a137-1ddb3e14d201",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Compare and Contrast - Graph"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 149,
   "id": "c67111e5-4fd3-4586-b51a-fd905486f247",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index import ComposableGraph, ListIndex"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 471,
   "id": "13134fbb-85da-4580-871f-373f373f60df",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "graph = ComposableGraph.from_indices(\n",
    "    ListIndex,\n",
    "    children_indices=[lyft_index, uber_index],\n",
    "    index_summaries=[\n",
    "        \"Provides information about Lyft financials for year 2021\",\n",
    "        \"Provides information about Uber financials for year 2021\",\n",
    "    ],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 472,
   "id": "e01ddd97-3770-44de-a5c6-859d22af9842",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.query_engine.transform_query_engine import TransformQueryEngine\n",
    "from llama_index.indices.query.query_transform.base import DecomposeQueryTransform\n",
    "\n",
    "# define decompose_transform\n",
    "decompose_transform = DecomposeQueryTransform(verbose=True)\n",
    "\n",
    "# define custom query engines\n",
    "custom_query_engines = {}\n",
    "for index in [lyft_index, uber_index]:\n",
    "    query_engine = index.as_query_engine(service_context=service_context)\n",
    "    query_engine = TransformQueryEngine(\n",
    "        query_engine,\n",
    "        query_transform=decompose_transform,\n",
    "        transform_extra_info={\"index_summary\": index.index_struct.summary},\n",
    "    )\n",
    "    custom_query_engines[index.index_id] = query_engine\n",
    "\n",
    "custom_query_engines[graph.root_id] = graph.root_index.as_query_engine(\n",
    "    service_context=service_context,\n",
    "    streaming=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 473,
   "id": "63e1f7a3-d792-4103-9351-1a11fe64f6b6",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# define graph\n",
    "g_engine = graph.as_query_engine(custom_query_engines=custom_query_engines)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 489,
   "id": "071ba563-c90d-4984-b111-7fec6687d38d",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[33;1m\u001b[1;3m> Current query: Compare Uber and Lyft revenue growth in 2021.\n",
      "\u001b[0m\u001b[38;5;200m\u001b[1;3m> New query:  What is Lyft's revenue growth in 2021?\n",
      "\u001b[0m\u001b[33;1m\u001b[1;3m> Current query: Compare Uber and Lyft revenue growth in 2021.\n",
      "\u001b[0m\u001b[38;5;200m\u001b[1;3m> New query:  What is Lyft's revenue growth in 2021?\n",
      "\u001b[0m\u001b[33;1m\u001b[1;3m> Current query: Compare Uber and Lyft revenue growth in 2021.\n",
      "\u001b[0m\u001b[38;5;200m\u001b[1;3m> New query:  What is Uber's revenue growth in 2021?\n",
      "\u001b[0m\u001b[33;1m\u001b[1;3m> Current query: Compare Uber and Lyft revenue growth in 2021.\n",
      "\u001b[0m\u001b[38;5;200m\u001b[1;3m> New query:  What is Uber's revenue growth in 2021?\n",
      "\u001b[0m"
     ]
    }
   ],
   "source": [
    "response = g_engine.query(\"Compare Uber and Lyft revenue growth in 2021.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 490,
   "id": "593ccb02-3cab-4f22-9f59-375dc6627090",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Uber's revenue growth in 2021 is higher than Lyft's revenue growth in 2021, at 57% compared to 38.2%."
     ]
    }
   ],
   "source": [
    "response.print_response_stream()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d79414d-9143-4150-a6c9-f4f59852e839",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Compare and Contrast - Agent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 491,
   "id": "5ddff61c-8d8e-4686-b944-d879f59d4466",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "llm = OpenAI(temperature=0.0, max_tokens=-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 497,
   "id": "78880292-cdba-4d0c-8e1a-ca9f8f99d562",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.langchain_helpers.agents import (\n",
    "    IndexToolConfig,\n",
    "    LlamaIndexTool,\n",
    "    LlamaToolkit,\n",
    ")\n",
    "\n",
    "uber_config = IndexToolConfig(\n",
    "    query_engine=uber_engine,\n",
    "    name=f\"Uber 10K 2021\",\n",
    "    description=f\"Provides information about Lyft financials for year 2021\",\n",
    "    tool_kwargs={\"return_direct\": False},\n",
    ")\n",
    "\n",
    "lyft_config = IndexToolConfig(\n",
    "    query_engine=lyft_engine,\n",
    "    name=f\"Lyft 10K 2021\",\n",
    "    description=f\"Provides information about Uber financials for year 2021\",\n",
    "    tool_kwargs={\"return_direct\": False},\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 498,
   "id": "07328a05-70d7-47f7-965d-58d4b3a5dbfa",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "toolkit = LlamaToolkit(\n",
    "    index_configs=[uber_config, lyft_config],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 499,
   "id": "8a37d9cb-bbb6-4b91-9c76-14a8db13c90f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.langchain_helpers.agents import create_llama_agent\n",
    "\n",
    "agent_chain = create_llama_agent(toolkit, llm, memory=memory, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 500,
   "id": "361ece8a-e6b8-491e-9a63-f8d3142dd743",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m I need to look at the financials of both companies.\n",
      "Action: Uber 10K 2021\n",
      "Action Input: Revenue growth\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3min 2021 was primarily driven by what?\n",
      "\n",
      "Revenue growth in 2021 was primarily driven by an increase in Delivery Gross Bookings of 71%, or 66% on a constant currency basis, due to an increase in food delivery orders and higher basket sizes as a result of stay-at-home order demand related to COVID-19, as well as continued expansion across U.S. and international markets. Additionally, Mobility Gross Bookings growth of 38%, or 36% on a constant currency basis, due to increases in Trip volumes as the business recovers from the impacts of COVID-19, contributed to the revenue growth.\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I need to compare this to Lyft's revenue growth.\n",
      "Action: Lyft 10K 2021\n",
      "Action Input: Revenue growth\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3m\n",
      "Revenue increased $843.6 million, or 36%, in 2021 as compared to the prior year, driven primarily by the significant increase in the number of Active Riders in 2021 as compared to the prior year, as vaccines became more widely distributed and more communities reopened.\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
      "Final Answer: Uber's revenue growth in 2021 was primarily driven by an increase in Delivery Gross Bookings of 71%, or 66% on a constant currency basis, while Lyft's revenue increased $843.6 million, or 36%, in 2021 as compared to the prior year, driven primarily by the significant increase in the number of Active Riders in 2021 as compared to the prior year.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\"Uber's revenue growth in 2021 was primarily driven by an increase in Delivery Gross Bookings of 71%, or 66% on a constant currency basis, while Lyft's revenue increased $843.6 million, or 36%, in 2021 as compared to the prior year, driven primarily by the significant increase in the number of Active Riders in 2021 as compared to the prior year.\""
      ]
     },
     "execution_count": 500,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent_chain.run(input=\"Compare Uber and Lyft revenue growth in 2021.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 534,
   "id": "822515c6-8266-4d55-bafe-f5262103b68a",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m I need to look at the financials of both Uber and Lyft\n",
      "Action: Uber 10K 2021\n",
      "Action Input: Customer segment and geography information\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3m\n",
      "The information provided in the context does not include customer segment and geography information. It does, however, provide information about our reportable segments and a reconciliation of the total segment adjusted EBITDA to loss from operations, as well as information about revenue and long-lived assets, net by geographic area.\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I need to look at the financials of Lyft\n",
      "Action: Lyft 10K 2021\n",
      "Action Input: Customer segment and geography information\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3m\n",
      "The Company's customer base is diverse, and its employees reflect that diversity. As of December 31, 2021, the ethnicity of its U.S. employees was 44% White, 30% Asian, 11% Hispanic or Latinx, 8% Black, and 5% two or more races, American Indian, Alaska Native, Native Hawaiian or other Pacific Islander. The Company generates its revenue from its multimodal transportation networks that offer access to a variety of transportation options through the Lyft Platform and mobile-based applications, primarily in the United States.\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
      "Final Answer: Uber and Lyft both have diverse customer bases and generate revenue from their multimodal transportation networks in the United States. Uber does not provide customer segment and geography information, while Lyft provides information about its U.S. employees' ethnicity and the geographic area in which it generates revenue.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\"Uber and Lyft both have diverse customer bases and generate revenue from their multimodal transportation networks in the United States. Uber does not provide customer segment and geography information, while Lyft provides information about its U.S. employees' ethnicity and the geographic area in which it generates revenue.\""
      ]
     },
     "execution_count": 534,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent_chain.run(\n",
    "    input=\"Compare and contrast the customer segments and geographies that grew the fastest\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fce81d02-e07b-467a-9682-a128d508c426",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
